(54) Backup system and method of generating a checkpoint for a database



(57) A backup system for a database, the backup
system being operable to:

store a preceding checkpoint containing the contents
of the database,

receive at least one transaction log, the at least
one transaction log identifying changes to the contents
of the database,

generate a new checkpoint by merging the preceding
checkpoint and the at least one transaction log,

and store the new checkpoint. For faster generation
of the new checkpoint, the or each transaction log
is sorted prior to merging the or each transaction log with
the checkpoint.
Description

Field of the Invention 

[0001] This invention relates to a backup system for 
a database, a data handling system comprising a back- 
up system, and a method of generating a checkpoint for 
a database. 

Background of the Invention 

[0002] To provide a backup system for a database, for 
example to guard against failure of the computer on which
the database is held, it is known to store a separate copy 
of the contents of the database, conventionally referred 
to as a checkpoint. This is of particular importance 
where the database is held in a volatile storage medium, 
particularly in a random access memory (RAM) of a 
computer. Conventionally, new checkpoints are taken at 
intervals, for example after a predetermined time has 
elapsed since the previous checkpoint, or when a suffi- 
cient number of changes have occurred to the database. 
In the event of failure of the computer, or of loss of,
corruption of or damage to the database, the database can be
restored to the state at the most recent checkpoint.
[0003] Where the database is very large, for example 
in telecommunication applications where the database 
may be on the order of a gigabyte or more, the process 
of generating a checkpoint can be particularly time con- 
suming, potentially over an hour. Since the process of 
reading the database content and writing the content to 
a suitable storage medium will require use of the 
processing capacity and communication bandwidth of 
the computer on which the database is stored, it is clear- 
ly advantageous to reduce the time spent generating a 
checkpoint. 

[0004] Because the database will have been updated
after the checkpoint has been taken, the checkpoint is
conventionally referred to as "fuzzy" in that it represents 
a past state of the database, that is one which is not 
entirely up to date. To record these updates, it is known 
to generate transaction logs, that is files recording the 
changes to the database since the generation of the 
most recent checkpoint. Transaction logs may be gen- 
erated, written to and closed and new transaction logs 
opened in response to appropriate criteria, for example 
at predetermined time intervals or at a maximum desired 
file size for a transaction log or any other user defined 
criteria. Particularly in the example of telecommunica- 
tion systems, whilst the checkpoint is being generated, 
the computer on which the database is held will still be 
active and so transaction logs may be generated during 
the generation of a checkpoint, as well as subsequent 
to the generation of a checkpoint. 
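
By way of illustration only, the closing criteria described above may be sketched as follows in Python. The class name `RotatingLog`, the parameters `max_updates` and `max_age_s`, and the representation of a log as an in-memory list are assumptions made for this sketch and are not taken from any known system.

```python
import time


class RotatingLog:
    """Collects updates and closes the current log on a size or age limit."""

    def __init__(self, max_updates=3, max_age_s=60.0, clock=time.monotonic):
        self.max_updates = max_updates  # hypothetical size criterion
        self.max_age_s = max_age_s      # hypothetical time criterion
        self.clock = clock              # injectable so the age limit can be tested
        self.closed_logs = []           # logs that have already been closed
        self._open_new_log()

    def _open_new_log(self):
        self.current = []
        self.opened_at = self.clock()

    def record(self, update):
        """Append an update, then close the log if either criterion is met."""
        self.current.append(update)
        too_big = len(self.current) >= self.max_updates
        too_old = self.clock() - self.opened_at >= self.max_age_s
        if too_big or too_old:
            self.closed_logs.append(self.current)
            self._open_new_log()
```

In this sketch a log is checked only when an update arrives; a real system might equally close a log on a timer.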
[0005] When it is desired to rebuild the database, the
process of rebuilding the database begins by writing the
most recent checkpoint into memory, and then progressively
updating the in-memory database in accordance with the
transaction logs. In the example of telecommunication
systems, the process of updating the most recent checkpoint
using the stored transaction logs may account for as much as
half the time taken by the rebuild process, with consequent
delays in bringing a computer back on-line after a failure.
It is also known to read the checkpoint and transaction logs
to generate a copy of the database for auditing or
management purposes, and similar disadvantages result.
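
By way of illustration only, the known rebuild process described above may be sketched as follows. The sketch assumes that the database is modelled as a Python dictionary keyed by (table, row key), that each logged update is a tuple (table, row key, serial number, new value), and that a new value of `None` denotes a deleted row; these representations and the name `rebuild` are hypothetical.

```python
def rebuild(checkpoint, transaction_logs):
    """Load the checkpoint, then replay every logged update in order."""
    database = dict(checkpoint)              # write the checkpoint into memory
    for log in transaction_logs:             # replay the logs, oldest first
        for table, key, serial, new_value in log:
            if new_value is None:            # assumption: None marks a deleted row
                database.pop((table, key), None)
            else:
                database[(table, key)] = new_value
    return database
```

The cost of the replay loop grows with the number of logged updates, which is why a long run of transaction logs lengthens the rebuild.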

[0006] An aim of the present invention is to provide a
new or improved backup system and/or method of gen- 
erating a checkpoint which overcomes one or more of 
the above problems. 

Summary of the Invention

[0007] According to a first aspect of the invention, we 
provide a backup system for a database, the backup 
system being operable to store a preceding checkpoint 
containing the contents of the database, receive at least 
one transaction log, the at least one transaction log iden- 
tifying changes to the contents of the database, gener- 
ate a new checkpoint by merging the preceding check- 
point and the at least one transaction log and store the 
new checkpoint. 

[0008] The backup system may be operable to sort 
the or each transaction log prior to merging the or each 
transaction log with the preceding checkpoint. 
[0009] The backup system may be operable to re- 
ceive a plurality of transaction logs, wherein the trans- 
action logs are sorted to combine the transaction logs 
prior to merging the transaction logs with the preceding
checkpoint.

[0010] The backup system may comprise a data storage
medium and a memory, wherein the checkpoint is
stored on the data storage medium and the or each 
transaction log is sorted in the memory. 
[0011] The backup system may be operable to store 
at least one transaction log prior to generating a new 
checkpoint. 

[0012] According to a second aspect of the invention,
we provide a data handling system comprising a backup 
system according to the first aspect of the invention and 
a database system, the database system comprising a 
memory, and being operable to store a database in the 
memory, the database system being operable to update 
the database in response to a transaction, record the 
transaction in a log, and transmit the transaction log to 
the backup system. 

[0013] The data handling system may be operable to 
transmit the checkpoint to the database system to re- 
build the database. 

[0014] The backup system may be operable to store 
at least one transaction log after generation of the 
checkpoint and may be operable to transmit the at least 
one transaction log to the database system with the 
checkpoint. 

[0015] The data handling system may comprise a data
storage medium wherein a copy of the database is
stored, the backup system being operable to transmit 
the checkpoint to the management system so that the 
database may be audited and/or the copy of the data- 
base synchronised with the database using the check- 
point. 

[0016] According to a third aspect of the invention, we
provide a method of generating a checkpoint for a da- 
tabase, the method comprising the steps of receiving at 
least one transaction log, the at least one transaction 
log identifying changes to the database, and merging 
the transaction log with a preceding checkpoint to gen- 
erate a new checkpoint. 

[0017] The or each transaction log may be sorted prior
to the step of merging the or each transaction log with 
the preceding checkpoint. 

Brief Description of the Drawings 

[0018] An embodiment of the present invention will be
described by way of example only with reference to the
accompanying drawings, wherein:

Figure 1 is a diagrammatic illustration of a known 
prior art telecommunication system, 

Figure 2 is a diagrammatic illustration of a telecom- 
munication system embodying the present inven- 
tion, 

Figure 3 is a diagrammatic illustration of a method 
of generating a checkpoint embodying the present 
invention, and 

Figure 4 is a graph illustrating the optimisation of
the method of Figure 3.

Detailed Description of the Preferred Embodiments 

[0019] In the following description, an embodiment of
the invention will be described with reference to a tele- 
communications application. It will be apparent to the 
skilled reader however, that the invention described 
herein will be applicable to any appropriate database 
where it is desired that checkpoints be provided, for example
in an appropriate real-time control system.
[0020] Referring to Figure 1 as an illustration of a 
known system, a service control point such as an HP 
Open Call (TM) service execution platform (SEP) is 
shown at 10, provided with an appropriate connection 
10a to a signalling network. The SEP 10 comprises a 
service execution platform host 11 provided with an in- 
memory database 12 held in RAM, a service logic exe- 
cution environment 13, appropriate protocol stacks 14, 
an event manager 15 and a fault tolerance controller 16.
The SEP 10 further comprises a local data storage medium,
in the present example a disk 17. Conventionally,
a service execution platform will comprise two service
execution platform hosts 11 in a "mated-pair" configuration
to provide for high availability such that the platform
continues to operate even in the event of failure of one
of the service execution platforms 10.

[0021] The in-memory database 12 is used to store
all of the information necessary to provide a service and
other functions as desired, for example to store and provide
call information and billing information and any other
information as desired. The database is held as an
in-memory database 12 for speed of access. To provide
for recovery of the contents of the in-memory database
12, at least one checkpoint 18 and a plurality of transaction
logs 19 are stored on the local disk 17. To generate
a checkpoint 18, the contents of the in-memory database
12 are copied to the local disk 17, conventionally
at a rate of 1 megabyte per second. Since an in-memory
database 12 can be on the order of a gigabyte or more,
this is necessarily time consuming, for example taking
up to an hour or so. Checkpoints are usually made every
two to four hours, whilst updates are recorded continually
in the transaction logs 19. Each transaction log is
closed and a new log opened depending on chosen criteria,
for example at predetermined time intervals or at the
desired size of the log file. During checkpoint generation
a proportion of the processing power and communication
bandwidth of the SEP host 11 will be taken up with
transmission of the contents of the in-memory database 12
to the disk 17.

[0022] For management and auditing purposes, it is
known to provide a further system, in the present example
a service management platform (SMP) 20. The SMP
20 comprises a data storage medium 21 on which copies
of a number of different in-memory databases 12 of
different SEPs 10 are held, and an input/output controller
22 to communicate with the SEP 10. Because the
in-memory database 12 and the copy held on the data storage
medium 21 should be synchronised, changes are
propagated "down" from the SMP 20 to the SEP 10 as shown
by arrow 23a and "up" from the SEP 10 to the SMP 20
as shown by arrow 23b via the input/output controller
22. Periodically, the copy of the database held on the
data storage medium 21 is audited by comparing its
content with the contents of the in-memory database 12,
which is time consuming and similarly draws on the
processing and bandwidth resources of the SEP host
11.

[0023] Referring now to Figure 2, a service execution
platform is shown at 110 similar to the SEP 10. In like
manner, the SEP 110 comprises an in-memory database
112, a service logic execution environment 113, a
plurality of network stacks 114, an event manager 115
and a fault tolerance controller 116. The SEP 110 has
an appropriate connection 117 to a signalling network.
A service management platform is shown at 120 similar
to the SMP 20, comprising a digital storage medium 121
and an input/output controller 122. A backup system is
shown at 124, comprising a data storage medium, in the
present example a disk 125, and a memory 126. Stored
on the disk 125 is at least the most recent checkpoint
118 and a plurality of transaction logs 119. In practice,
at least the two most recent checkpoints and associated
logs will be stored on the disk 125.
[0024] The SEP 110 and backup system 124 operate
as follows. When the in-memory database 112 is updated,
for example as a result of network messages received
over the connection 117, the SEP 110 will record
the update or "transaction" in a file, or transaction log,
and transmit it to the backup system 124 as shown by
arrow 127. The transaction log 119 will be recorded on
the disk 125. The transaction log 119 will contain one or
more updates, each update comprising the old data, the
new data and information identifying the database location
which has been updated, in particular a table identifier
and a row key. Each transaction log will further contain
an update serial number which provides a unique
identifier for each update. The number of updates stored
in a single transaction log 119 may be selected as
discussed below.
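
By way of illustration only, the content of a transaction log described above may be modelled as follows. The type names `Update` and `TransactionLog`, the field names, and the helper `location` are assumptions made for this sketch; the description specifies only which items of information each update carries.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Update:
    serial: int     # update serial number, unique for each update
    table_id: str   # table identifier of the updated location
    row_key: Any    # row key of the updated location
    old_data: Any   # data held at the location before the update
    new_data: Any   # data held at the location after the update


@dataclass
class TransactionLog:
    updates: List[Update]   # one or more updates, in the order recorded


def location(update: Update) -> Tuple[str, Any]:
    """Return the database location identified by an update."""
    return (update.table_id, update.row_key)
```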

[0025] When it is desired to establish a new checkpoint,
the backup system 124 will operate as shown in
Figure 3. The backup system 124 will read all the transaction
logs T1 to Tm as shown at 119 into memory and sort
the transaction logs T1 to Tm into temporary files for
efficient merging as shown at step 128. Where the
in-memory database 112 comprises a relational database,
the transaction logs will identify the database location
using one or more table identifiers and row keys, and
this sorting process may advantageously and efficiently
be performed by sorting each transaction listed in the
transaction logs T1 to Tm by the appropriate table
identifiers, row keys, and update serial number. The sorting
by update serial number is desirable because the same
location may have been updated more than once and
sorting the transactions by update serial number will
ensure that the transactions are applied to the location in
the correct order. The most recent checkpoint Cn as
shown at 118 is then merged with the sorted transaction
logs T'1 to T'm shown at 119' at step 129 to generate a
new checkpoint Cn+1 as shown at 130. This updated
checkpoint is then stored on the local storage medium
125. The step 128 of sorting the transaction logs T1 to Tm
and checkpoint Cn is preferably performed in the memory
126 to speed up the sorting process and not performed
by reading and writing to the storage medium
125. Advantageously, the transaction logs T1 to Tm will be
effectively merged by the sorting step 128 so that the
merge step 129 consists simply of writing the combined
content of the transaction logs T1 to Tm to the checkpoint.

[0026] After generation of the checkpoint Cn+1, the
transaction logs T1 to Tm and temporary sorted transaction
logs T'1 to T'm may be discarded.
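
By way of illustration only, the sorting step 128 and the merge step 129 may be sketched as follows. The sketch assumes that a checkpoint is modelled as a Python dictionary keyed by (table, row key), that each update is a tuple (table, row key, serial number, new value), and that a new value of `None` denotes a deleted row. The function names and these representations are hypothetical, and the sketch holds everything in memory rather than using temporary files.

```python
def sort_logs(transaction_logs):
    """Step 128 (sketch): combine the logs, sorted by table, row key, serial."""
    updates = [u for log in transaction_logs for u in log]
    return sorted(updates, key=lambda u: (u[0], u[1], u[2]))


def merge(checkpoint, sorted_updates):
    """Step 129 (sketch): apply the sorted updates to a copy of checkpoint Cn."""
    new_checkpoint = dict(checkpoint)
    for table, key, serial, new_value in sorted_updates:
        if new_value is None:                # assumption: None marks a deleted row
            new_checkpoint.pop((table, key), None)
        else:
            new_checkpoint[(table, key)] = new_value
    return new_checkpoint
```

Because updates to the same location are ordered by serial number, the update applied last to a location is the most recent one, whichever log it arrived in.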
[0027] The backup system 124 and SEP 110 may be
initialised together, such that the in-memory database
112 is initially empty and the first checkpoint C0 is a null
file. Alternatively, C0 may comprise a checkpoint made
in conventional manner by writing the initial contents of
the in-memory database 112 to the backup system 124.
[0028] It will thus be apparent that, in accordance with
the present invention, the only call on the processing
and bandwidth capacity of the SEP 110 is that necessary
to transmit a transaction log to the backup system
124. The new checkpoint Cn+1 is generated simply by
updating the most recent checkpoint Cn in view of the
transaction logs 119. The method of generating a new
checkpoint is entirely performed by the dedicated backup
system 124, thus speeding up the process of generating
the checkpoint and not demanding any of the
processing or bandwidth capacity of the service execution
platform 110. The up-to-date checkpoint Cn+1 is then
available for use as may be desired, for example to
restore the in-memory database 112 by being transmitted
to the SEP 110 along with any recent transaction logs
119 as shown by arrow 127, or for transmission
to, for example, the service management platform
120 as shown by arrow 131 for the purposes of auditing
the in-memory database 112 or synchronising the database
copy held on the storage medium 121. The process
of generating the checkpoint may be performed relatively
frequently compared to known methods to minimise
the number of transaction logs required. It will be
apparent that when it is necessary to, for example,
restore the in-memory database 112, the most recent
checkpoint Cn+1 will either be up to date or almost up to
date, and that a relatively quick recovery process will be
performed.

[0029] It will be possible to optimise the number and
size of the transaction logs 119 held on the data storage
medium 125. Where the transactions are stored in a
relatively large number of relatively small files, the sorting
step 128 will be faster because there will be more,
relatively quick sorting operations performed in the memory
126 and fewer steps of reading the disk 125, which
are relatively slow. However, with a greater number of
files, the time taken for the merge step 129 will increase
as it will be necessary to read the disk 125 more often
to retrieve more files. This trade-off is illustrated in Figure
4, where the X axis shows the number of transactions
or files into which the transactions are stored, the Y axis
shows the time taken to perform the sorting and merging
steps, line 132 shows the time taken for the sorting
operation, line 133 shows the time taken for the merge
operation 130 and line 134 shows the total time for the sort
step 128 and the merge step 129. It will be apparent that
there is an optimum minimum at point 135.
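
By way of illustration only, locating the minimum of the total time may be sketched as follows. The description gives no numerical form for the curves of Figure 4, so the function `best_file_count` takes the two timing functions as arguments, and the example curves `toy_sort_time` and `toy_merge_time` are purely hypothetical stand-ins chosen so that one falls and the other rises with the number of files. In practice the timings would be measured on the system concerned.

```python
def best_file_count(candidates, sort_time, merge_time):
    """Return the candidate file count with the smallest total time."""
    return min(candidates, key=lambda n: sort_time(n) + merge_time(n))


# Hypothetical example curves only; not taken from Figure 4.
def toy_sort_time(n):
    return 100.0 / n      # sorting gets quicker as files get smaller


def toy_merge_time(n):
    return 1.0 * n        # merging gets slower as files get more numerous


optimum = best_file_count(range(1, 51), toy_sort_time, toy_merge_time)
```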

[0030] It will be apparent that the backup system 124
may advantageously be separate from the SEP 110. Indeed,
the backup system 124 may be physically removed
from the SEP 110. Alternatively, the separation
of the backup system 124 and SEP 110 may be "virtual",
that is the backup system 124 resides on the same computer
as the SEP 110 but uses dedicated resources, for
example a dedicated CPU. Separation of the backup
system 124 from the SEP 110 ensures that checkpoint
generation does not use any of the bandwidth of the
SEP CPU, or any of the bandwidth required to access
the in-memory database 112, leaving it available for use.
[0031] For faster recovery of the in-memory database
112, it may be advantageous to have the most recent
checkpoint available on the SEP 110, for example by
providing a shared disk between the backup system 124
and SEP 110.

[0032] In the method of Figure 3, where there are
many logs T1 to Tm, the process of sorting all of the logs
at step 128 and merging all the logs with the previous
checkpoint at step 129 allows the merge step to be performed
with a read stage to read the previous checkpoint
once from the disk 125 and a single write phase
to write the new checkpoint to the disk 125. The number
of transaction logs 119 maintained on the disk 125 may
be much smaller than the number stored on conventional
systems and new checkpoints may be generated more
frequently.
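
By way of illustration only, a merge that reads the previous checkpoint once and writes the new checkpoint once may be sketched as follows. The sketch assumes, beyond what is stated above, that the previous checkpoint is itself held as records sorted by (table, row key), that each update is a tuple (table, row key, serial number, new value) sorted in that order, and that a new value of `None` denotes a deleted row. The function name `merge_single_pass` is hypothetical.

```python
from itertools import groupby


def merge_single_pass(checkpoint_records, sorted_updates):
    """Yield new checkpoint records, reading each input exactly once.

    checkpoint_records: iterable of ((table, key), value), sorted by location.
    sorted_updates: iterable of (table, key, serial, new_value), sorted by
    (table, key, serial); new_value None marks a deletion (assumption).
    """
    # Keep only the last (highest serial) update for each location.
    latest = ((loc, list(group)[-1][3])
              for loc, group in groupby(sorted_updates, key=lambda u: (u[0], u[1])))
    records = iter(checkpoint_records)
    record = next(records, None)
    for loc, new_value in latest:
        # Copy untouched records that sort before the updated location.
        while record is not None and record[0] < loc:
            yield record
            record = next(records, None)
        # Skip the old version of the updated record, if there is one.
        if record is not None and record[0] == loc:
            record = next(records, None)
        if new_value is not None:
            yield (loc, new_value)
    # Copy the remaining untouched records.
    while record is not None:
        yield record
        record = next(records, None)
```

Because both inputs are consumed as iterators, neither the previous checkpoint nor the new one needs to be held in memory in full; only the updates for one location are buffered at a time.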

[0033] It is known that on some computers, where a
CPU is instructed to write large volumes of data, that
CPU is then unavailable for any other operation. In this
case, by writing small transaction logs to the backup
system 124, the CPU of the SEP 110 is made available
for other operations and is not blocked in such a manner.
[0034] In the present specification "comprises"
means "includes or consists of" and
"comprising" means "including or consisting of".
[0035] The features disclosed in the foregoing de- 
scription, or the following claims, or the accompanying 
drawings, expressed in their specific forms or in terms 
of a means for performing the disclosed function, or a 
method or process for attaining the disclosed result, as 
appropriate, may, separately, or in any combination of 
such features, be utilised for realising the invention in 
diverse forms thereof. 



Claims 

1. A backup system for a database, the backup system
being operable to:

store a preceding checkpoint containing the
contents of the database,

receive at least one transaction log, the at
least one transaction log identifying changes to the
contents of the database,

generate a new checkpoint by merging the
preceding checkpoint and the at least one transaction
log,

and store the new checkpoint.

2. A backup system according to claim 1 operable to 
sort the or each transaction log prior to merging the 
or each transaction log with the preceding check- 
point. 

3. A backup system according to claim 2 operable to
receive a plurality of transaction logs, and wherein
the transaction logs are sorted to combine the transaction
logs prior to merging the transaction logs with
the preceding checkpoint.

4. A backup system according to claim 2 or claim 3
comprising a data storage medium and a memory,
wherein the checkpoint is stored on the data storage
medium and the or each transaction log is sorted
in the memory.

5. A backup system according to any one of the preceding
claims operable to store at least one transaction
log prior to generating a new checkpoint.

6. A data handling system comprising a backup system
according to any one of the preceding claims
and a database system, the database system comprising
a memory, and being operable to store a database
in the memory, the database system being
operable to update the database in response to a
transaction, record the transaction in a log, and
transmit the transaction log to the backup system.

7. A data handling system according to claim 6 wherein
the backup system is operable to transmit the
checkpoint to the database system to rebuild the
database.

8. A data handling system according to claim 7 wherein
the backup system is operable to store at least
one transaction log after generation of the checkpoint
and wherein the backup system is operable to
transmit the at least one transaction log to the database
system with the checkpoint.

9. A data handling system according to claim 6 or
claim 7 or claim 8 further comprising a management
system, the management system comprising a data
storage medium wherein a copy of the database is
stored, the backup system being operable to transmit
the checkpoint to the management system.

10. A method of generating a checkpoint for a database,
the method comprising the steps of:

receiving at least one transaction log, the at
least one transaction log identifying changes to the
database, and

merging the transaction log with a preceding
checkpoint to generate a new checkpoint.

11. A method according to claim 10 comprising the step
of sorting the or each transaction log prior to the
step of merging the or each transaction log with the
preceding checkpoint.